Abstract. Traditional simulation-based applications for exploring a parameter
space to understand a physical phenomenon or to optimize a design are rapidly
overwhelmed by data volume when large numbers of simulations with different
parameters are carried out. Optimizing reservoir management through
simulation-based studies, in which large numbers of realizations are sought
using detailed geologic descriptions, is an example of such applications. In
this paper, we describe a software architecture to facilitate large-scale
simulation studies, involving ensembles of long-running simulations and
analysis of vast volumes of output data. This architecture is built on top of
two frameworks we have developed: IPARS and DataCutter. These frameworks make
it possible to implement tools and applications to run large-scale
simulations, and to generate and investigate terabyte-scale datasets
efficiently.


1  Introduction 

Numerical simulations provide a powerful mechanism for investigating and
understanding complex systems and the interactions between various entities in
those systems, and for effectively exploring design alternatives. In such
applications, a large ensemble of simulations is carried out using different
parameter values that describe different potential initial states of the
complex system under study. These applications are highly data-driven:
choosing the next set of simulations to be performed requires analysis of data
from earlier simulations. As high-performance parallel and distributed
platforms become more ubiquitous, traditional simulation approaches are
overwhelmed by the vast volumes of data that need to be queried and analyzed.
In this paper, using oil reservoir management applications as an example, we
describe the design and a prototype implementation of a software system for
large-scale simulation studies.


Joel  Saltz  et  al. 


Fig.  1.  Software  system  architecture 

Numerical simulation of oil and gas reservoirs can aid the design and
implementation of optimal production strategies. With a better understanding
of oil and gas produced from existing reservoirs, better techniques can be
devised to locate new reserves and maximize oil production from the existing
reserves. Complex models of the subsurface also allow better assessment of the
risk that existing and new reservoirs pose to the environment, and of the
remediation and storage of hazardous waste.

Despite technological advances in methods of determining reservoir properties,
operators still have at best a partial knowledge of critical parameters, such
as rock permeability, which govern production rates. Thus a major challenge to
these objectives is incorporating geologic uncertainty while maintaining
operational flexibility in large, detailed flow models. One approach to this
problem is to simulate alternative production strategies (number, type, timing
and location of wells) applied to multiple realizations of multiple
geostatistical models. In a typical study, a scientist runs an ensemble of
simulations to study the effects of varying oil reservoir properties (e.g.,
permeability and oil/water ratio) over a long period of time. On
high-performance computers, this approach can lead to unmanageably large
volumes of output data even for relatively coarse descriptions. Storage,
analysis and visualization of the large volumes of data generated by an
ensemble of simulations are key to achieving a better understanding and
characterization of oil reservoirs.

Figure  1  illustrates  the  overall  architecture  of  the  software  system  for  large 
scale  oil  reservoir  simulation  studies.  The  system  consists  of  two  main  frameworks 
that  we  have  developed. 

- IPARS (Integrated Parallel Accurate Reservoir Simulator) is a framework
  that supports multiple physical models and algorithms for the solution of
  multiphase flow and transport problems in porous media. The framework
  provides common memory management for general geometric grids, portable
  parallel communication, linear solvers with state-of-the-art
  preconditioners, keyword input, and output with visualization.
- DataCutter is a middleware framework for subsetting and processing
  multidimensional datasets in a distributed environment. The application
  processing structure is implemented as a set of interacting components,
  referred to as filters. Application filters can be placed on the machines
  in a given setting so that communication and computation overheads are
  minimized.

In this architecture, IPARS is used to simulate alternative production
strategies for a large number of geostatistical realizations of a hypothetical
reservoir. Output from IPARS is stored on distributed collections of
disk-based storage systems for interactive data analysis. Using the DataCutter
framework, various data analysis operations can be implemented that query and
manipulate those datasets. These operations can be executed on the storage
systems where the datasets are stored or on other machines dispersed across a
network. A graphical user interface allows a scientist to formulate queries to
carry out different analysis scenarios: economic ranking of the alternatives;
exploration of the physical basis for differences in behavior between
realizations, in particular to identify regions of bypassed oil; and
identification of representative realizations, which could be used as a basis
for further optimization. Visualization of the datasets can be carried out
remotely.

In  the  rest  of  the  paper,  we  present  the  components  of  the  software  system  in 
more  detail.  We  describe  the  implementation  of  several  data  analysis  scenarios 
for  a  sample  case  study. 

2  System  Components 

2.1  IPARS  Framework 

IPARS [6, 18] represents a new approach to reservoir simulator development,
emphasizing modularity of code, portability to many platforms, and ease of
integration with other software. It models multiphase, multiphysics flow in
porous media, and is suitable for massively parallel computers or clusters of
workstations. There are currently ten physical models in IPARS, including
multiphase gas-oil-water and air-water flow and reactive transport models.
These models can be coupled for multiphysics simulations, including couplings
between IPARS models or with external codes. For example, the IPARS black-oil
model was used in a loosely coupled geomechanics and flow implementation
driven by reservoir subsidence problems. Solvers used by IPARS employ
state-of-the-art techniques for nonlinear and linear problems, including
multigrid and other preconditioners.

A key feature of the IPARS framework is that it explicitly builds upon the
multiblock multiphysics approach [10, 14, 18], which allows for the
mathematically rigorous treatment of multiple domains in which different
physical processes are occurring, and provides a basis for implementing
different numerical schemes in different parts of the domain using nonmatching
grids while preserving mass and momentum conservation.

The black-oil model implemented in IPARS is a three-phase (water, oil and gas)
model describing the flow in a petroleum reservoir [13] with three components.
As such it can be considered a subset of a compositional model [13]. Here it
is assumed that no mass transfer occurs between the water phase and the other
two phases, and that the water phase can be identified with the water
component. In the hydrocarbon (oil-gas) system, only two components, light and
heavy hydrocarbons, are considered. The black-oil model described here has
been shown [11] to give accurate results by comparison with available
analytical solutions and, in other cases, with solutions obtained by Eclipse
[4], a recognized industrial reservoir simulation tool. In addition, the
models and solvers in the IPARS framework were shown to scale in parallel
[17].


2.2  DataCutter 

A number of toolkits integrate processing with parallel data retrieval on
tightly-coupled systems [3, 5]. Component-based frameworks provide a viable
programming environment for application development in distributed
environments. Besides easing the development of complex applications, such
models facilitate application implementations that can adapt to the
heterogeneous and dynamic nature of the environment. Several research projects
have focused on developing different types of component-based models
[7, 12, 15].

DataCutter [1, 2] is a component framework designed to support subsetting and
processing of large datasets. DataCutter implements a filter-stream
programming model for developing data-intensive applications. In this model,
the application processing structure is implemented as a set of components,
referred to as filters, that exchange data through a stream abstraction. The
interface for a filter consists of three functions: (1) an initialization
function (init), in which any required resources such as memory for data
structures are allocated and initialized, (2) a processing function (process),
in which user-defined operations are applied on data elements, and (3) a
finalization function (finalize), in which the resources allocated in init are
released. Filters are connected via logical streams. A stream denotes a
uni-directional data flow from one filter (i.e., the producer) to another
(i.e., the consumer). A filter is required to read data from its input streams
and write data to its output streams only. We define a data buffer as an array
of data elements transferred from one filter to another. The current
implementation of the logical stream delivers data in fixed-size buffers, and
uses TCP for point-to-point stream communication.
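
As a rough illustration of the filter-stream model described above, the following Python sketch mimics the three-function filter interface. It is not the DataCutter API: the names `Filter`, `ScaleFilter`, `run_filter`, `collect` and `END_OF_WORK` are hypothetical, and in-memory queues stand in for the TCP-backed logical streams.

```python
from queue import Queue

END_OF_WORK = object()  # hypothetical marker signalling the end of a unit of work


class Filter:
    """Illustrative filter exposing the three interface functions named in the text."""

    def init(self):
        """Allocate and initialize any resources the filter needs."""

    def process(self, buffer):
        """Apply user-defined operations to one data buffer and return the result."""
        raise NotImplementedError

    def finalize(self):
        """Release the resources allocated in init."""


class ScaleFilter(Filter):
    """Toy filter that multiplies every data element by a constant factor."""

    def __init__(self, factor):
        self.factor = factor
        self.buffers_seen = None

    def init(self):
        self.buffers_seen = 0

    def process(self, buffer):
        self.buffers_seen += 1
        return [self.factor * x for x in buffer]

    def finalize(self):
        self.buffers_seen = None


def run_filter(flt, in_stream, out_stream):
    """Drive one filter: it reads only its input stream and writes only its output stream."""
    flt.init()
    while True:
        buffer = in_stream.get()
        if buffer is END_OF_WORK:
            break
        out_stream.put(flt.process(buffer))
    out_stream.put(END_OF_WORK)
    flt.finalize()


def collect(stream):
    """Drain a stream into a list of buffers (helper for this example only)."""
    buffers = []
    while True:
        buffer = stream.get()
        if buffer is END_OF_WORK:
            return buffers
        buffers.append(buffer)


source, sink = Queue(), Queue()
for buf in ([1, 2], [3, 4]):
    source.put(buf)
source.put(END_OF_WORK)
run_filter(ScaleFilter(10), source, sink)
result = collect(sink)
```

In this sketch the producer side is simulated by pre-loading the input queue; in a real deployment each filter would run concurrently and streams would cross host boundaries.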

The overall processing structure of an application is realized by a filter
group, which is a set of filters connected through logical streams. When a
filter group is instantiated to process an application query, the runtime
system establishes TCP/IP socket connections between filters placed on
different hosts before starting the execution of the application query.
Filters placed on the same host execute as separate threads. An application
query is handled as a unit of work (UOW) by the filter group.

The programming model provides several abstractions to facilitate performance
optimizations. A transparent filter copy is a copy of a filter in a filter
group. The filter copy is transparent in the sense that it shares the same
logical input and output streams of the original filter. A transparent copy of
a filter can be made if the semantics of the filter group are not affected.
That is, the output of a unit of work should be the same, regardless of the
number of transparent copies. The transparent copies enable data-parallelism
for execution of a single query, while multiple filter groups allow
concurrency among multiple queries. The filter runtime system maintains the
illusion of a single logical point-to-point stream for communication between a
logical producer filter and a logical consumer filter. For distribution
between transparent copies, the runtime system supports a Round-Robin (RR)
mechanism and a Demand Driven (DD) mechanism based on the buffer consumption
rate. DD aims to send buffers to the filter that would process them fastest.
When a consumer filter starts processing of a buffer received from a producer
filter, it sends an acknowledgment message to the producer filter to indicate
that the buffer is being processed. A producer filter chooses the consumer
filter with the minimum number of unacknowledged buffers to send a data buffer
to, thus achieving a better balancing of the load.
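
The demand-driven policy described above can be sketched in a few lines of Python. This is only a model of the bookkeeping, not DataCutter code: the class `DemandDrivenProducer`, its method names, and the consumer identifiers are assumptions introduced for illustration, and ties are broken by insertion order.

```python
class DemandDrivenProducer:
    """Tracks unacknowledged buffers per transparent copy of the consumer filter."""

    def __init__(self, consumer_ids):
        # One counter of sent-but-not-yet-acknowledged buffers per consumer copy.
        self.unacked = {cid: 0 for cid in consumer_ids}

    def choose_consumer(self):
        """Pick the copy with the fewest unacknowledged buffers."""
        return min(self.unacked, key=self.unacked.get)

    def send(self, buffer):
        """Select a target copy for the buffer and record it as unacknowledged."""
        cid = self.choose_consumer()
        self.unacked[cid] += 1
        return cid

    def acknowledge(self, cid):
        """Called when a copy reports that it has started processing a buffer."""
        self.unacked[cid] -= 1


producer = DemandDrivenProducer(["copy0", "copy1"])
targets = [producer.send("buffer-a"), producer.send("buffer-b")]
producer.acknowledge("copy1")           # copy1 starts processing first
targets.append(producer.send("buffer-c"))
```

Because `copy1` acknowledged its buffer before `copy0` did, the third buffer goes to `copy1`, which is the faster consumer in this toy trace.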

3  A  Case  Study 

In  this  section,  we  describe  a  case  study  we  have  implemented  using  IPARS  and 
DataCutter.  This  case  study  involves  generation  of  a  large  collection  of  data 
sets  from  IPARS  simulations,  and  implementation  and  execution  of  various  data 
exploration  scenarios. 


3.1  Data  Generation 

We have generated a large dataset consisting of 207 separate realizations
using the IPARS simulation framework. The input data for this dataset is based
on the industry benchmark SPE9 problem [8] and comes from a black-oil
(three-phase) flow problem on a grid with 9,000 cells. At each time step, the
values of seventeen separate variables are output for each node in the grid. A
total of 10,000 time steps are taken and the total output stored for each
realization is about 6.9 GB. The 207 realizations were taken from among 18
geostatistical models and 4 well configurations/production scenarios. The
geostatistical models are used to randomly generate permeability fields that
are characterized by statistical parameters such as covariance and correlation
length. The total dataset size is roughly 1.5 Terabytes (see footnote 5) and
was generated and stored on a storage cluster of 50 Linux nodes (PIII-650,
128MB, Switched Fast Ethernet) with a total disk storage of 9TB.
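
As a quick sanity check on the figures quoted above, the arithmetic can be written out as follows. The use of decimal units (1 TB = 1000 GB) is an assumption, and the variable names are illustrative only.

```python
realizations = 207
gb_per_realization = 6.9            # approximate figure quoted in the text
total_gb = realizations * gb_per_realization
total_tb = total_gb / 1000          # assumes decimal units (1 TB = 1000 GB)

geostatistical_models = 18
well_configurations = 4
# Number of distinct model/configuration pairs the realizations are drawn from.
combinations = geostatistical_models * well_configurations

print(f"total output: about {total_tb:.2f} TB")
print(f"model/configuration combinations: {combinations}")
```

The product comes to about 1.43 TB, consistent with the stated total of roughly 1.5 Terabytes, and the 18 models and 4 configurations give 72 possible pairings.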


3.2  Data  Exploration  Scenarios 

We have implemented several data exploration scenarios using the DataCutter
framework. These scenarios involve user-defined queries for economic
evaluation as well as technical evaluation, such as determination of
representative realizations and identification of areas of bypassed oil.

5 The case study that is described in this paper was demonstrated at
Supercomputing 2001; a study with a larger dataset (5 Terabytes), which is
distributed across three sites (San Diego Supercomputer Center, University of
Maryland, and Ohio State University), was demonstrated at Supercomputing 2002.
We plan to report on the performance evaluation of the latter study in future
work.


Economic Evaluation. In the optimization of oil field production strategies,
the objective function to be maximized is the resulting economic value of a
given production strategy. The value can be measured in a variety of ways. In
our model we compute both the net present value (NPV) and return on investment
(ROI). In the computation of the NPV for a given realization, a query
integrates over time the revenue from produced oil and gas, and the expenses
from water injection and production, accounting for the time value of the
resources produced. This calculation is performed for a subset of the
realizations chosen by the user. As all of the well production and injection
data for each realization resides in a single file on a single disk, the data
access pattern for this application is relatively simple, and most computation
time is spent parsing the output file. The well data is also a relatively
small part of the output data at each time step, so this computation is
neither compute-intensive nor data-intensive. Presently, the operations for
the economic evaluation are implemented as a single DataCutter filter, which
also performs data retrieval from the storage system.
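
The NPV query described above can be sketched as a discounted sum over time-stamped well records. The paper does not give the exact formula or the prices used, so everything here is an assumption: the record layout, the function name `net_present_value`, the unit prices and costs, and the use of a fixed annual discount rate to model the time value of the resources.

```python
def net_present_value(records, oil_price, gas_price,
                      water_inj_cost, water_prod_cost, annual_rate):
    """Discounted sum of revenue minus expense over all reporting periods.

    Each record is a dict with the elapsed time in years and the volumes
    produced or injected during that period (hypothetical layout).
    """
    npv = 0.0
    for rec in records:
        revenue = rec["oil"] * oil_price + rec["gas"] * gas_price
        expense = (rec["water_injected"] * water_inj_cost
                   + rec["water_produced"] * water_prod_cost)
        # Discount the net cash flow back to time zero.
        discount = (1.0 + annual_rate) ** rec["years"]
        npv += (revenue - expense) / discount
    return npv


records = [
    {"years": 0.0, "oil": 100.0, "gas": 0.0,
     "water_injected": 10.0, "water_produced": 0.0},
    {"years": 1.0, "oil": 110.0, "gas": 0.0,
     "water_injected": 0.0, "water_produced": 0.0},
]
npv = net_present_value(records, oil_price=1.0, gas_price=0.5,
                        water_inj_cost=1.0, water_prod_cost=1.0,
                        annual_rate=0.10)
```

With these made-up numbers the first period contributes 90 and the second contributes 110 discounted by one year at 10 percent, so the result is approximately 190.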


Bypassed Oil. Depending on the distribution of reservoir permeability and the
production strategy employed, it is possible for oil to remain unextracted
from certain regions in the reservoir. To optimize the production strategy, it
is useful to know the location and size of these regions of bypassed oil. To
locate these regions, the user selects a subset of datasets ($D$), a subset of
time steps ($T$), a minimum oil saturation value ($S_{o,tol}$), a maximum oil
velocity ($V_{o,tol}$), and a minimum number of connected grid cells ($N_c$)
for a bypassed oil pocket. The goal is to find all the datasets in $D$ that
have bypassed oil pockets with at least $N_c$ grid cells. A cell ($C$) is a
potential bypassed oil cell if $S_{o,C} > S_{o,tol}$ and
$V_{o,C} \le V_{o,tol}$.

We implemented a set of filters that carry out the various operations required
to find the bypassed oil regions. The implementation consists of three
filters. RD - Read data filter retrieves the data of interest from disk and
writes the data to its output stream. A data buffer in the output stream
contains oil velocity and oil saturation values, and corresponds to a portion
of the grid at a time step in a dataset. CC - Connected component filter
finds bypassed oil pockets at a time step in each data buffer received from
RD. These oil pockets are stored in a byte array, in which each entry denotes
a grid cell and records whether or not the cell is a bypassed oil cell. The CC
filter writes the byte array for each time step to its output stream, which
connects CC to the MT filter. MT - Merge over time filter performs an AND
operation on the data buffers received from CC, and finds the bypassed oil
pockets that persist over the selected time steps. The result is sent to the
client.
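
The CC and MT steps described above can be sketched in plain Python as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, the flat cell indexing, the choice of face adjacency (6-connectivity) for "connected", and the use of "at most" for the velocity threshold are all assumptions.

```python
from collections import deque


def candidate_cells(saturation, velocity, s_tol, v_tol):
    """Mark cells whose oil saturation exceeds s_tol and whose velocity is at most v_tol."""
    return [s > s_tol and v <= v_tol for s, v in zip(saturation, velocity)]


def neighbors(idx, dims):
    """Yield the face-adjacent neighbours of a flat cell index on an nx*ny*nz grid."""
    nx, ny, nz = dims
    i, j, k = idx % nx, (idx // nx) % ny, idx // (nx * ny)
    for di, dj, dk in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                       (0, -1, 0), (0, 0, 1), (0, 0, -1)):
        a, b, c = i + di, j + dj, k + dk
        if 0 <= a < nx and 0 <= b < ny and 0 <= c < nz:
            yield a + nx * (b + ny * c)


def connected_component_filter(candidates, dims, n_c):
    """CC step: keep candidate cells that lie in connected pockets of at least n_c cells."""
    pockets = bytearray(len(candidates))
    seen = [False] * len(candidates)
    for start, is_candidate in enumerate(candidates):
        if not is_candidate or seen[start]:
            continue
        seen[start] = True
        component, frontier = [start], deque([start])
        while frontier:  # breadth-first search over candidate cells
            cell = frontier.popleft()
            for nb in neighbors(cell, dims):
                if candidates[nb] and not seen[nb]:
                    seen[nb] = True
                    component.append(nb)
                    frontier.append(nb)
        if len(component) >= n_c:
            for cell in component:
                pockets[cell] = 1
    return pockets


def merge_over_time(pocket_arrays):
    """MT step: AND the per-time-step byte arrays cell by cell."""
    merged = bytearray(pocket_arrays[0])
    for arr in pocket_arrays[1:]:
        for cell, flag in enumerate(arr):
            merged[cell] &= flag
    return merged


dims = (4, 1, 1)                      # four cells in a row
velocity = [0.0, 0.0, 0.0, 0.0]
steps = [[0.8, 0.8, 0.1, 0.8],        # oil saturation at the first time step
         [0.8, 0.8, 0.8, 0.8]]        # oil saturation at the second time step
per_step = [connected_component_filter(
                candidate_cells(sat, velocity, s_tol=0.5, v_tol=0.01), dims, n_c=2)
            for sat in steps]
merged = merge_over_time(per_step)
```

In the toy example only the first two cells form a pocket of at least two cells at both time steps, so they are the only cells flagged after the merge.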

This scenario accesses the large four-dimensional (three spatial dimensions
and time) datasets which are output for each realization. Each of the output
variables is written to a separate file, so this computation involves the
subsetting of data spread across several files. Additionally, if the
simulation was run in parallel, the data for different parts of the domain
could reside on separate disks or nodes.


Representative Realization. Running multiple realizations with the same
geostatistical model and well configurations can give an idea of the upper and
lower bounds of performance for a particular production strategy. It is also
of interest to find one realization for a given production scenario that best
represents the average or expected behavior. A client query to find a
representative realization for a given subset of realizations computes the
average of the primary IPARS unknowns - oil concentration ($C_o$), water
pressure ($W_p$), gas pressure ($G_p$) - and then finds the realization in the
subset which is closest to the average in the sense that


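\[
\min \sum_{\text{all grid points}} \left(
  \frac{|C_o - C_{o\text{-avg}}|}{C_{o\text{-avg}}}
  + \frac{|P_w - P_{w\text{-avg}}|}{P_{w\text{-avg}}}
  + \frac{|P_g - P_{g\text{-avg}}|}{P_{g\text{-avg}}}
\right)
\]

is realized (here $P_w$ and $P_g$ are the water and gas pressures).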

The DataCutter implementation consists of four filters. RD - Read filter
retrieves the data of interest from disk. The read filter sends data from each
dataset to the SUM and DIFF filters. A data buffer from the read filter is a
portion of the grid at one time step. SUM - Sum filter computes the sum of
the $C_o$, $W_p$, and $G_p$ variables at each grid point across the datasets
selected by the user. AVG - Average filter calculates the average of the
$C_o$, $W_p$, and $G_p$ values. DIFF - Difference filter finds the sum of the
differences between the grid values and the average values for each dataset.
It sends the difference to the client, which keeps track of the differences
for each time step and averages them over all time steps for each dataset.
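
A minimal single-time-step sketch of the SUM, AVG and DIFF steps is given below. It is a reading of the description above rather than the actual filter code: the function name, the dictionary layout, the variable keys, and the normalization of each difference by the average value are assumptions, and the averages are assumed to be nonzero.

```python
def representative_realization(realizations):
    """Return the realization closest to the per-grid-point average.

    `realizations` maps a name to a dict of variable name -> list of values,
    one value per grid point, for a single time step (hypothetical layout).
    """
    names = list(realizations)
    variables = ("C_o", "W_p", "G_p")
    n_points = len(realizations[names[0]]["C_o"])

    # SUM and AVG steps: per-grid-point average of each variable.
    avg = {}
    for var in variables:
        sums = [0.0] * n_points
        for name in names:
            for p, value in enumerate(realizations[name][var]):
                sums[p] += value
        avg[var] = [s / len(names) for s in sums]

    # DIFF step: summed relative distance of each realization from the average.
    distance = {}
    for name in names:
        total = 0.0
        for var in variables:
            for value, mean in zip(realizations[name][var], avg[var]):
                total += abs(value - mean) / mean
        distance[name] = total

    best = min(distance, key=distance.get)
    return best, distance


data = {
    "r1": {"C_o": [1.0, 1.0], "W_p": [10.0, 10.0], "G_p": [5.0, 5.0]},
    "r2": {"C_o": [2.0, 2.0], "W_p": [10.0, 10.0], "G_p": [5.0, 5.0]},
    "r3": {"C_o": [3.0, 3.0], "W_p": [10.0, 10.0], "G_p": [5.0, 5.0]},
}
best, distance = representative_realization(data)
```

In this toy input `r2` coincides with the average at every grid point, so it is selected; a client handling several time steps would average the per-step distances before taking the minimum.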


3.3  Visualization 

We have developed two different implementations for visualizing output from a
realization. The first visualization employs isosurface rendering implemented
using the Active Data Repository (ADR) framework [3], which is designed to
support processing of large, out-of-core datasets via generalized reduction
operations on distributed memory systems. The ADR implementation uses the
marching cubes and polygon rendering functions of the Visualization Toolkit
(VTK) [16] for extracting and rendering an isosurface from large, out-of-core
datasets on a distributed-memory parallel system with a local disk farm [9].
Figure 2 shows a visualization of bypassed oil regions using the isosurface
rendering. The second visualization tool is based on direct volume rendering.
We implemented a DataCutter filter using the volume rendering library
developed at the Ohio Supercomputer Center. This implementation uses a
texture-based volume rendering approach and takes advantage of graphics cards
with 3D hardware texture rendering (e.g., NVIDIA GeForce 3).

Fig. 2. Visualization of bypassed oil


4  Results 


We present results for the representative realization and bypassed oil
scenarios. The experiments were carried out using the DataCutter
implementations of the two scenarios and the 1.5TB dataset generated in this
study. The dataset is stored on 19 nodes of the 9.5TB storage cluster at the
University of Maryland. In the experiments, we varied the number of datasets
accessed by a query. For this purpose, we submitted a total of 28 queries
(varying the number of datasets from 1 to 200): 14 queries for the bypassed
oil analysis and 14 queries for the representative realization analysis. Half
of the queries in each set request data over 10 time steps (time steps 0
through 9999, with increments of 1000 time steps), while the other half
retrieve 25 time steps (time steps 0 through 9999, with increments of 400 time
steps).
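
The two time-step selections described above can be enumerated directly, which also confirms the stated counts of 10 and 25 steps. The variable names are illustrative only.

```python
# Time steps 0 through 9999 sampled at the two increments used by the queries.
coarse_steps = list(range(0, 10000, 1000))
fine_steps = list(range(0, 10000, 400))

print(len(coarse_steps), coarse_steps[-1])
print(len(fine_steps), fine_steps[-1])
```

The coarse selection ends at time step 9000 and the fine selection at time step 9600, since the next multiple in each case would exceed 9999.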

Figure 3 shows the execution time for each query. As seen in the figure, the
query execution time remains below 1 second for the bypassed oil scenario for
up to 40 datasets. As the number of datasets is increased, the query execution
time increases, as expected. For a query that accesses 200 datasets over 25
time steps, the execution time is about 3 seconds. Thus, we are able to
achieve interactive rates even for queries that access a large number of
datasets from the collection. The experimental results show that queries for
the representative realization scenario take longer, as the operations
involved are more expensive. The query execution time remains below 5 seconds
for queries that access up to 40 datasets over 25 time steps. Our preliminary
results show that the query execution does not scale well beyond 40 datasets
for the representative realization scenario. This is because the number of
transparent copies for the SUM and DIFF filters was fixed at four in the
experiments. In future work, we plan to examine the effect on performance of
varying the number of transparent copies.

Fig. 3. (a) Performance results for bypassed oil computation (b) Performance
results for representative realization computation

5  Conclusions 


We have demonstrated a new paradigm for applying reservoir simulation to the
challenges of reservoir management. The selected challenge was to enable the
evaluation of large numbers of realizations, both of geological models and of
well patterns. The black-oil model within the IPARS framework provided the
numerical solutions to the forward flow problems, while the DataCutter
middleware provided the means for subsetting and filtering the
multidimensional output. The volume of data resulting from such studies can be
extremely large. Such datasets would be unmanageable for most evaluation
tools, especially for complex queries such as identifying representative
realizations or locating regions of bypassed oil. The IPARS/DataCutter
applications enable the creation, interrogation and visualization of such
datasets while maintaining the familiarity and speed of interaction of the
traditional simulation workflow. Thus many more realizations of higher
resolution geologic models and more production strategies can be studied in
greater detail within a given time, increasing the utility of the study for
decision making.